Fast Robust Inverse Transform SAT and Multi-stage Adaptation
Authors
Abstract
We present a new method of Speaker Adapted Training (SAT) that is more robust, faster, and results in a lower error rate than previous methods. The method, called Inverse Transform SAT (ITSAT), is based on removing the differences between speakers before training, rather than modeling the differences during training. We develop several methods to avoid the problems associated with inverting the transformation. In one method, we interpolate the transformation matrix with an identity or diagonal transformation. We also apply constraints to the matrix to avoid estimation problems. We show that by using many diagonal-only transformation matrices with constraints we can achieve performance that is comparable to that of the original SAT method at a fraction of the cost. In addition, we describe a multi-stage approach to Maximum Likelihood Linear Regression (MLLR) unsupervised adaptation and we show that it is more effective than a single-stage regular MLLR adaptation. As a final stage, we adapt the resulting model at a finer resolution, using Maximum A Posteriori (MAP) adaptation. With the combination of all the above adaptation methods we obtain a 13.6% overall reduction in WER relative to Speaker Independent (SI) training and decoding.
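The interpolation and constraint ideas in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so everything below is an assumption made for illustration: the linear interpolation weight `lam`, the clamping bounds `lo` and `hi`, the affine form x = A x' + b for a speaker transform, and all function names are hypothetical rather than taken from the paper.

```python
# Illustrative sketch only; the formulas and names are assumptions, not the paper's.

def interpolate_with_identity(A, lam):
    """Return lam * A + (1 - lam) * I, pulling A toward the identity.

    Assumed motivation: a matrix closer to the identity is better
    conditioned and therefore safer to invert.
    """
    n = len(A)
    return [
        [lam * A[i][j] + (1.0 - lam) * (1.0 if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]

def diagonal_part(A):
    """Keep only the diagonal entries of A (diagonal-only transformation)."""
    return [A[i][i] for i in range(len(A))]

def clamp_diagonal(d, lo, hi):
    """Constrain each diagonal entry to [lo, hi] to avoid extreme scalings."""
    return [min(max(v, lo), hi) for v in d]

def inverse_diagonal_transform(x, d, b):
    """Undo an assumed speaker transform x = d * x' + b, element by element."""
    return [(xi - bi) / di for xi, di, bi in zip(x, d, b)]

# Hypothetical, poorly conditioned 2x2 speaker transform.
A = [[2.0, 0.5], [0.0, 0.01]]
A_smooth = interpolate_with_identity(A, 0.5)
d = clamp_diagonal(diagonal_part(A_smooth), 0.6, 1.4)
normalized = inverse_diagonal_transform([5.0, 1.0], [2.0, 0.5], [1.0, 0.5])
```

With a diagonal-only transform, inversion reduces to one division per feature dimension, which is consistent with the abstract's claim that the diagonal variant costs a fraction of the original method.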
Similar resources
Fast robust inverse transform speaker adapted training using diagonal transformations
We present a new method of Speaker Adapted Training (SAT) that is more robust, faster, and results in lower error rate than the previous methods. The method, called Inverse Transform SAT (ITSAT) is based on removing the differences between speakers before training, rather than modeling the differences during training. We develop several methods to avoid the problems associated with inverting th...
Rapid unsupervised speaker adaptation robust in reverberant environment conditions
We expand the conventional rapid adaptation based on N-closest speakers' sufficient statistics (suff stat) to achieve robustness under reverberant conditions. We integrated our fast dereverberation technique based on optimized multi-band spectral subtraction as pre-processing. This removes the late reflection components of the reverberant signal effectively and fast. Speakers’ suff stat are then ...
Is the Sharp Adaptation Transform more plausible than CMCCAT2000?
The modified Bradford chromatic adaptation transform (CMCCAT2000) is a von Kries type model of adaptation that best accounts for a variety of corresponding colour data sets. The transform works in three stages. First, XYZs are linearly mapped to a new ’RGB’ space. The RGB sensitivities are somewhat like the cones but have their sensitivity concentrated in narrower regions of the visible spectru...
Fast inverse transform sampling in one and two dimensions
We develop a computationally efficient and robust algorithm for generating pseudo-random samples from a broad class of smooth probability distributions in one and two dimensions. The algorithm is based on inverse transform sampling with a polynomial approximation scheme using Chebyshev polynomials, Chebyshev grids, and low rank function approximation. Numerical experiments demonstrate that our ...